An Intrinsic Stopping Criterion for Committee-Based Active Learning
Abstract
As supervised machine learning methods are increasingly used in language technology, the need for high-quality annotated language data becomes pressing. Active learning (AL) is a means to alleviate the burden of annotation. This paper addresses the problem of knowing when to stop the AL process without having the human annotator make an explicit decision on the matter. We propose and evaluate an intrinsic criterion for committee-based AL of named entity recognizers.
Similar resources
Stopping Criteria for Active Learning of Named Entity Recognition
Active learning is a proven method for reducing the cost of creating the training sets that are necessary for statistical NLP. However, there has been little work on stopping criteria for active learning. An operational stopping criterion is necessary to be able to use active learning in NLP applications. We investigate three different stopping criteria for active learning of named entity recog...
Multi-Criteria-Based Strategy to Stop Active Learning for Data Annotation
In this paper, we address the issue of deciding when to stop active learning for building a labeled training corpus. Firstly, this paper presents a new stopping criterion, classification-change, which considers the potential ability of each unlabeled example on changing decision boundaries. Secondly, a multi-criteria-based combination strategy is proposed to solve the problem of predefining an a...
A stopping criterion for active learning
Active learning (AL) is a framework that attempts to reduce the cost of annotating training material for statistical learning methods. While a lot of papers have been presented on applying AL to natural language processing tasks reporting impressive savings, little work has been done on defining a stopping criterion. In this work, we present a stopping criterion for active learning based on the...
Learning a Stopping Criterion for Active Learning for Word Sense Disambiguation and Text Classification
In this paper, we address the problem of knowing when to stop the process of active learning. We propose a new statistical learning approach, called minimum expected error strategy, to defining a stopping criterion through estimation of the classifier’s expected error on future unlabeled examples in the active learning process. In experiments on active learning for word sense disambiguation and...
Using Variance as a Stopping Criterion for Active Learning of Frame Assignment
Active learning is a promising method to reduce human effort for data annotation in different NLP applications. Since it is an iterative task, it should be stopped at some point which is optimum or near-optimum. In this paper we propose a novel stopping criterion for active learning of frame assignment based on the variability of the classifier’s confidence score on the unlabeled data. The im...